Skip to content

feat(io): add reader implementation for BufRead#196

Open
kskalski wants to merge 1 commit intoanza-xyz:masterfrom
kskalski:ks/buf_read
Open

feat(io): add reader implementation for BufRead#196
kskalski wants to merge 1 commit intoanza-xyz:masterfrom
kskalski:ks/buf_read

Conversation

@kskalski
Copy link
Contributor

@kskalski kskalski commented Feb 18, 2026

Introduces BufReadAdapter<R> in wincode::io::std_read module, a low-overhead wrapper over any std::io::BufRead source (e.g. BufReader<R>) implementing wincode::io::Reader.

Mix of consuming and scoped borrowing operations supported by checking consume state on each operation and using one of:

  • Fast path — when the reader’s internal buffer already holds enough bytes, peek_array and take_scoped return a subslice directly with no allocation, copy_into_slice uses read directly into the dst buffer
  • Aux-buffer fallback — when a contiguous slice larger than one currently available from internal reader's fill_buf is requested, bytes are assembled into an aux_buf: Vec<u8>. Subsequent operations can re-serve or extend from this buffer, but switch back to using internal reader once the bytes in aux_buf are consumed.

@kskalski kskalski force-pushed the ks/buf_read branch 9 times, most recently from 75e8eec to b4dd766 Compare March 18, 2026 09:06
@kskalski kskalski marked this pull request as ready for review March 18, 2026 10:33
@kskalski kskalski requested review from cpubot March 18, 2026 10:34
@cpubot
Copy link
Contributor

cpubot commented Mar 22, 2026

The complexity and state management required for this implementation is a strong indication to me that we should actually deprecate peek_* and consume* APIs.

With those removed, implementation can simply become:

use {
crate::io::{ReadResult, Reader},
std::{
io::{BufReader, Read},
mem::{MaybeUninit, transmute},
},
};
pub struct ReadAdapter<R: ?Sized> {
inner: R,
}
impl<R: Read + ?Sized> Reader<'_> for ReadAdapter<R> {
#[inline]
fn copy_into_slice(&mut self, dst: &mut [MaybeUninit<u8>]) -> ReadResult<()> {
Ok(self
.inner
.read_exact(unsafe { transmute::<&mut [MaybeUninit<u8>], &mut [u8]>(dst) })?)
}
}
impl<R: Read + ?Sized> Reader<'_> for BufReader<R> {
#[inline]
fn copy_into_slice(&mut self, dst: &mut [MaybeUninit<u8>]) -> ReadResult<()> {
Ok(self.read_exact(unsafe { transmute::<&mut [MaybeUninit<u8>], &mut [u8]>(dst) })?)
}
}

There is a benchmark in this branch you can run: https://github.com/anza-xyz/wincode/blob/af1904c5d8f83c6c8fa44aa45fba09dae0ffd21f/wincode/benches/reader.rs

The simple implementation without peek_* or consume* methods is faster in every case, and in two cases by a very significant margin. And the implementation in this PR is actually slower than bincode for the SimpleStruct case.

On x86,

The simple solution, relative to the solution in this PR is:

  • ~6x faster in the SimpleStruct case
  • ~3.7x faster in the PodStruct case
  • ~10% faster in the VersionedMessage case.

The simple solution, relative to bincode is:

  • ~4.5x faster in the SimpleStruct case
  • ~38x faster in the PodStruct case (the solution in this PR was much better than bincode here, too)

This isn't to say that we couldn't further optimize the solution proposed here to be closer to the performance of bincode or the solution without peek_* or consume*. But rather that the additional complexity is almost certainly not worth it.

We only have two cases in solana-sdk that are using peek_byte, and both instances are trivial to re-write without it. All implementations in wincode have already been updated such that they never require peek_* or consume* methods. From a library perspective, it doesn't make sense for us to bend over backwards to accommodate those two solana-sdk use-cases. The simple solution adds virtually no maintenance burden, and the removal of peek_* and consume* APIs meaningfully simplify things across the board.


My feeling is that we should merge the context system and deprecate peek_* / consume* APIs in a v0.4.9 release. Now that #220 adds context as an additional trait rather than bundling into SchemaRead, there is no breaking risk.

VersionedMessage can be implemented entirely without context. As seen here

let header = {
let mut reader = unsafe { reader.as_trusted_for(2) }?;
MessageHeader {
num_required_signatures: variant,
num_unsigned_accounts: reader.take_byte()?,
num_signed_accounts: reader.take_byte()?,
}
};
let account_key = <Address as SchemaRead<C>>::get(reader.by_ref())?;
let recent_blockhash = <Hash as SchemaRead<C>>::get(reader.by_ref())?;

And we'll only need to add one context implementation for ShortU16 for this code, which I've already proof-of-concepted here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants